Use one RNG per Event Producer #192

carlaKC · 2024-07-10T18:21:48Z

This PR updates #171 to create one RNG per event producer to decrease the variance across runs of the simulator when running with a fixed seed. This approach was originally taken in #171, but each RNG was seeded with the same value which led to nodes with the same capacity having the exact same payment flows (not what we want). I suggested that we share a RNG to get around that, but it introduces other problems - specifically that all tasks in the simulator are vying for mutex access to the same RNG, so the order of operations highly effects the determinism of the simulator.

This PR takes a new approach, which "salts" the seed with material from each node's public key to make the RNGs per-task deterministic but different.

Two design decisions that I wasn't totally happy with:

We use a single NetworkGenerator because it holds the whole graph, so we can't pass a unique RNG in for each event producer. The solution here is to include the RNG in the API of the network generator in each call:

It seemed inconsistent to do this for network generator and not PaymentGenerator (though we could, because this one is generated per event producer).
We end up passing in unnecessary RNG to the DefinedPaymentGenerator implementation of PaymentGenerator where we don't actually need it.

We seed the RNG with Option<u64> and then salt it with u64, this should probably by an Option<u64, u64> because we don't need the salt if we don't have a seed, but it seems unnecessary to unwrap and re-wrap (got lazy, might fix).

To get an idea of the improvement, I ran the simulator 50x with the two approaches and measured the coefficient of variance (mean / std dev) for the revenue that one node has in the simulation.

Old approach: CoV = 0.5
New approach: CoV = 40

A larger number indicates that our variance is lower, so this is quite a dramatic improvement which IMO makes it worthwhile to make the API changes.

Nodes and activity are not included here, because we want to be able to create our config struct separately from parsing of nodes and activity. This reasoning will become more apparent in future commits when we add two different running modes for the simulator (regular operation and simulated network mode).

Preparation for further refactoring where we'll remove the mutex.

carlaKC · 2024-07-10T18:23:50Z

@enigbe while using the simulator with fixed seed I noticed that the variance issues that you pointed out get much worse with larger networks (makes sense, there are more events taking place so we have more variance).

I should have thought about this option during review when you had the original version with one rng per producer (as I've done here), sorry! Took me running this to get more of an intuition into where we need to fix things up.

enigbe · 2024-07-12T06:22:35Z

@enigbe while using the simulator with fixed seed I noticed that the variance issues that you pointed out get much worse with larger networks (makes sense, there are more events taking place so we have more variance).

I should have thought about this option during review when you had the original version with one rng per producer (as I've done here), sorry! Took me running this to get more of an intuition into where we need to fix things up.

Please don't apologize. It was fine implementing it in the way you suggested (helps me learn more about the project). I will review this today/tomorrow.

enigbe · 2024-07-16T16:00:38Z

sim-lib/src/lib.rs

+        // generator will get the same set of values across runs (with a shared RNG, the order in
+        // which tasks access the shared RNG would change the value that each generator gets). We
+        // "salt" the set seed with a value based on the node's public key so that each generator will
+        // start with a different seed, but it will be the same across runs. It is not critical is the


nit: should be if. (... not critical if ...)

enigbe

Apart from a single nitpick this looks fine. Happy to take another look when you think this is ready.

As-is, we are sharing a single RNG for all of our random operations - destination selection for each payment, and selection of payment parameters per-sending node. All of these operations may complete in in different orders every time we run (be it because a task gets access to a mutex in a different order, or a payment is faster or slower to dispatch on the actual underlying node). This leads to large variance in our supposedly deterministic run, because the values we get from our "fixed" seed are different based on this order of operations. To address this, we use separate RNGs per-task, which reduces the variety in order of operations impact on the values that each task gets from its RNG. We can't do this for network_generator (because we use one for all of our tasks), so we take the approach of passing a single RNG into both our Payment and Network generator. This is admittedly a little clunky API-wise, IMO a great improvement in the feature. Testing this change against the original approach with: - 50x runs of the simulator - Sum up total revenue for a target node per run - Look at the coefficient of variance (mean / std dev) for each data set Results: - Old approach: CoV = 0.5 - New approach: CoV = 40 This increase indicates a large *decrease* in the variance in the revenue of a single node across runs (which is what we want, more deterministic outcomes with a fixed seed).

To get some variety in our simulator, we update our RNG that's based on the same set seed to also use some data from nodes public keys. This will be the same across runs, but provides different values for each node.

carlaKC · 2024-11-12T20:31:40Z

Closing due to (my own) inactivity, I'll open it back up when I've got some capacity for this.

Would still like to find an approach that doesn't require carting a rng around our traits, but couldn't figure that out when I looked at it last.

carlaKC added 2 commits July 10, 2024 14:05

sim-lib/refactor: rename MutRng to RngSend

9bae9f2

Preparation for further refactoring where we'll remove the mutex.

enigbe reviewed Jul 16, 2024

View reviewed changes

carlaKC added 2 commits October 16, 2024 09:26

sim-lib: make rng unique but deterministic per sending node

082a6db

To get some variety in our simulator, we update our RNG that's based on the same set seed to also use some data from nodes public keys. This will be the same across runs, but provides different values for each node.

carlaKC force-pushed the 171-decreasevariance branch from b24bb78 to 082a6db Compare October 16, 2024 13:28

carlaKC closed this Nov 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use one RNG per Event Producer #192

Use one RNG per Event Producer #192

carlaKC commented Jul 10, 2024

carlaKC commented Jul 10, 2024

enigbe commented Jul 12, 2024

enigbe Jul 16, 2024 •

edited

Loading

enigbe left a comment

carlaKC commented Nov 12, 2024

Use one RNG per Event Producer #192

Use one RNG per Event Producer #192

Conversation

carlaKC commented Jul 10, 2024

carlaKC commented Jul 10, 2024

enigbe commented Jul 12, 2024

enigbe Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

enigbe left a comment

Choose a reason for hiding this comment

carlaKC commented Nov 12, 2024

enigbe Jul 16, 2024 •

edited

Loading